Phylogeny Based on Whole Genome as inferred from Complete Information Set Analysis.
Abstract
Previous molecular phylogeny algorithms rely mainly on multiple-sequence alignments of carefully selected characteristic sequences and are thus not directly applicable to whole-genome phylogeny, where events such as rearrangements make full-length alignments impossible. We introduce here the concept of the Complete Information Set (CIS) and its measurement implementation as an evolutionary distance without reference to sizes. As a proof-of-method test, the 16S rRNA sequences of 22 completely sequenced Bacteria and Archaea species are used to reconstruct a phylogenetic tree, which is generally consistent with the commonly accepted one. Based on whole genomes, our further efforts yield a highly robust whole-genome phylogenetic tree, supporting separate monophyletic clusters of species with similar phenotypes as well as the early evolution of thermophilic Bacteria and the late divergence of Eukarya. The purpose of this work is not to contradict or confirm previous phylogeny standards but rather to bring a new algorithm and tool to the phylogeny research community. The software to estimate the sequence distance and the materials used in this study are available upon request from the corresponding author.
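The abstract does not specify how the CIS distance is computed, so the following is only a minimal sketch of the general idea of an alignment-free pairwise distance. It is not the CIS method. As a stand-in it compares normalized k-mer frequency profiles with the Jensen-Shannon divergence; the function names (kmer_profile, js_divergence, distance_matrix) and the choice k=3 are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def kmer_profile(seq, k=3):
    """Return normalized k-mer frequencies of a sequence (illustrative helper)."""
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency profiles.

    The value is 0 for identical profiles and 1 for profiles that share
    no k-mer at all.
    """
    div = 0.0
    for kmer in set(p) | set(q):
        pi, qi = p.get(kmer, 0.0), q.get(kmer, 0.0)
        mi = (pi + qi) / 2
        if pi > 0:
            div += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * log2(qi / mi)
    return div


def distance_matrix(seqs, k=3):
    """Pairwise distances for a dict mapping name -> sequence.

    Stand-in distance only; the CIS distance itself is not reproduced here.
    """
    profiles = {name: kmer_profile(s, k) for name, s in seqs.items()}
    return {
        (a, b): js_divergence(profiles[a], profiles[b])
        for a in profiles
        for b in profiles
    }
```

A matrix of this kind could then be passed to any distance-based tree-building method, such as neighbor joining. Because no alignment is involved, sequences of very different lengths and gene orders can be compared directly.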
Similar resources
A More Accurate and Efficient Whole Genome Phylogeny
To reconstruct a phylogeny for a given set of species, most of the previous approaches are based on the similarity information derived from a subset of conserved regions (or genes) in the corresponding genomes. In some cases, the regions chosen may not reflect the evolutionary history of the species and may be too restricted to differentiate the species. It is generally believed that the infere...
Meta-Analysis of General Bacterial Subclades in Whole-Genome Phylogenies Using Tree Topology Profiling
In the last two decades, a large number of whole-genome phylogenies have been inferred to reconstruct the Tree of Life (ToL). Underlying data models range from gene or functionality content in species to phylogenetic gene family trees and multiple sequence alignments of concatenated protein sequences. Diversity in data models together with the use of different tree reconstruction techniques, di...
Whole-genome phylogeny of mammals: evolutionary information in genic and nongenic regions.
Ten complete mammalian genome sequences were compared by using the "feature frequency profile" (FFP) method of alignment-free comparison. This comparison technique reveals that the whole nongenic portion of mammalian genomes contains evolutionary information that is similar to their genic counterparts--the intron and exon regions. We partitioned the complete genomes of mammals (such as human, c...
Molecular phylogeny of Scutellaria (Lamiaceae; Scutellarioideae) in Iranian highlands inferred from nrITS and trnL-F sequences
Scutellaria with about 360 species is one of the largest genera of Lamiaceae. The Iranian highlands accommodate about 40 Scutellaria spp., and is considered as one of the main centers of diversity of the genus. Here, we present a phylogenetic study for 44 species of Scutellaria especially from Iranian highlands, representing major subgeneric taxa, based on nuclear rib...
Household Clustering of Escherichia coli Sequence Type 131 Clinical and Fecal Isolates According to Whole Genome Sequence Analysis
Background. Within-household sharing of strains from the resistance-associated H30R1 and H30Rx subclones of Escherichia coli sequence type 131 (ST131) has been inferred based on conventional typing data, but it has been assessed minimally using whole genome sequence (WGS) analysis. Methods. Thirty-three clinical and fecal isolates of ST131-H30R1 and ST131-H30Rx, from 20 humans and pets in 6 h...
Journal title:
- Journal of biological physics
Volume 28, Issue 3
Pages: -
Publication date: 2002